Constructors
What’s the deal with constructors?
Constructors build objects from dust.
Constructors are like “init functions”. They turn a pile of arbitrary bits into a living object. Minimally they initialize internally used fields. They may also allocate resources (memory, files, semaphores, sockets, etc.).
“ctor” is a typical abbreviation for constructor.
Is there any difference between List x; and List x();?
A big difference!
Suppose that List is the name of some class. Then function f() declares a local List object called x:
void f()
{
List x; // Local object named x (of class List)
// ...
}
But function g() declares a function called x() that returns a List:
void g()
{
List x(); // Function named x (that returns a List)
// ...
}
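One way to convince yourself of the difference is to ask the compiler what each name is. The sketch below assumes C++11 or later (for decltype and <type_traits>) and uses an empty stand-in for List; the names a and b, and the helper functions, are only for illustration.

```cpp
#include <cassert>
#include <type_traits>

class List { };  // Empty stand-in for the List class in the text

List a;    // An object named a (of class List)
List b();  // A declaration of a function named b that returns a List

// Report whether each name refers to a function rather than an object.
bool a_is_function() { return std::is_function<decltype(a)>::value; }
bool b_is_function() { return std::is_function<decltype(b)>::value; }

void check_declarations()
{
  assert(!a_is_function());  // a is an object
  assert(b_is_function());   // b is a function, even though it looks like an object
}
```

Note that b is never defined or called here; decltype only inspects the declaration.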
Can one constructor of a class call another constructor of the same class to initialize the this object?
The answer below applies to Classic (pre-11) C++. A separate question covers the C++11 feature of constructors that call same-type constructors.
Nope.
Let’s work an example. Suppose you want your constructor Foo::Foo(char) to call another constructor of the same class, say Foo::Foo(char,int), in order that Foo::Foo(char,int) would help initialize the this object. Unfortunately there’s no way to do this in Classic C++.
Some people do it anyway. Unfortunately it doesn’t do what they want. For example, the line Foo(x, 0); does not call Foo::Foo(char,int) on the this object. Instead it calls Foo::Foo(char,int) to initialize a temporary, local object (not this), then it immediately destructs that temporary when control flows over the ;.
class Foo {
public:
Foo(char x);
Foo(char x, int y);
// ...
};
Foo::Foo(char x)
{
// ...
Foo(x, 0); // Does NOT help initialize the this object!!
// ...
}
You can sometimes combine two constructors via a default parameter:
class Foo {
public:
Foo(char x, int y = 0); // Has the effect of combining the two constructors
// ...
};
If that doesn’t work, e.g., if there isn’t an appropriate default parameter that combines the two constructors, sometimes you can share their common code in a private init() member function:
class Foo {
public:
Foo(char x);
Foo(char x, int y);
// ...
private:
void init(char x, int y);
};
Foo::Foo(char x)
{
init(x, int(x) + 7);
// ...
}
Foo::Foo(char x, int y)
{
init(x, y);
// ...
}
void Foo::init(char x, int y)
{
// ...
}
BTW do NOT try to achieve this via placement new. Some people think they can say new(this) Foo(x, int(x)+7) within the body of Foo::Foo(char). However that is bad, bad, bad. Please don’t write me and tell me that it seems to work on your particular version of your particular compiler; it’s bad. Constructors do a bunch of little magical things behind the scenes, but that bad technique steps on those partially constructed bits. Just say no.
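For comparison, here is roughly what the C++11 feature mentioned at the top of this answer looks like. This is a minimal sketch assuming a C++11 (or later) compiler; the data members x_ and y_ and the accessors are made up for illustration.

```cpp
#include <cassert>

class Foo {
public:
  Foo(char x);          // Delegates to Foo(char,int)
  Foo(char x, int y);
  char x() const { return x_; }
  int  y() const { return y_; }
private:
  char x_;
  int  y_;
};

Foo::Foo(char x, int y) : x_(x), y_(y) { }

// C++11 delegating constructor: Foo(char,int) initializes the this object.
// No other member initializers may appear alongside the delegation.
Foo::Foo(char x) : Foo(x, int(x) + 7) { }

void check_delegation()
{
  Foo f('a');
  assert(f.x() == 'a');
  assert(f.y() == int('a') + 7);
}
```

Unlike the Foo(x, 0); statement shown earlier, the delegation appears in the initialization list, not in the constructor body.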
Is the default constructor for Fred always Fred::Fred()?
No.
A “default constructor” is a constructor that can be called with no arguments. One example of this is a constructor that takes no parameters:
class Fred {
public:
Fred(); // Default constructor: can be called with no args
// ...
};
Another example of a “default constructor” is one that can take arguments, provided they are given default values:
class Fred {
public:
Fred(int i=3, int j=5); // Default constructor: can be called with no args
// ...
};
Which constructor gets called when I create an array of Fred objects?
Fred’s default constructor (except as discussed below).
class Fred {
public:
Fred();
// ...
};
int main()
{
Fred a[10]; // Calls the default constructor 10 times
Fred* p = new Fred[10]; // Calls the default constructor 10 times
// ...
}
If your class doesn’t have a default constructor, you’ll get a compile-time error when you attempt to create an array using the above simple syntax:
class Fred {
public:
Fred(int i, int j); // Assume there is no default constructor
// ...
};
int main()
{
Fred a[10]; // ERROR: Fred doesn't have a default constructor
Fred* p = new Fred[10]; // ERROR: Fred doesn't have a default constructor
// ...
}
However, even if your class already has a default constructor, you should try to use std::vector<Fred> rather than an array (arrays are evil). std::vector lets you decide to use any constructor, not just the default constructor:
#include <vector>
int main()
{
std::vector<Fred> a(10, Fred(5,7)); // The 10 Fred objects in std::vector a will be initialized with Fred(5,7)
// ...
}
Even though you ought to use a std::vector rather than an array, there are times when an array might be the right thing to do, and for those, you might need the “explicit initialization of arrays” syntax. Here’s how:
class Fred {
public:
Fred(int i, int j); // Assume there is no default constructor
// ...
};
int main()
{
Fred a[10] = {
Fred(5,7), Fred(5,7), Fred(5,7), Fred(5,7), Fred(5,7), // The 10 Fred objects are
Fred(5,7), Fred(5,7), Fred(5,7), Fred(5,7), Fred(5,7) // initialized using Fred(5,7)
};
// ...
}
Of course you don’t have to do Fred(5,7) for every entry; you can put in any numbers you want, even parameters or other variables.
Finally, you can use placement-new to manually initialize the elements of the array. Warning: it’s ugly: the raw array can’t be of type Fred, so you’ll need a bunch of pointer-casts to do things like compute array index operations. Warning: it’s compiler- and hardware-dependent: you’ll need to make sure the storage is aligned with an alignment that is at least as strict as is required for objects of class Fred. Warning: it’s tedious to make it exception-safe: you’ll need to manually destruct the elements, including in the case when an exception is thrown part-way through the loop that calls the constructors. But if you really want to do it anyway, read up on placement-new. (BTW placement-new is the magic that is used inside of std::vector. The complexity of getting everything right is yet another reason to use std::vector.)
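If you want to see what the warnings above amount to, here is a rough sketch. It assumes C++11 (for alignas), a Fred whose constructor takes two ints, and a hypothetical live counter that exists only so the checks can confirm every element gets destructed. It stores the pointers that placement-new returns rather than casting the raw buffer, which is one of several ways to do the bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <new>  // placement new

class Fred {
public:
  Fred(int i, int j) : i_(i), j_(j) { ++live; }
  ~Fred() { --live; }
  int sum() const { return i_ + j_; }
  static int live;  // Hypothetical counter of currently constructed Freds
private:
  int i_, j_;
};
int Fred::live = 0;

int sumOfTenFreds()
{
  const std::size_t N = 10;
  // Raw storage, aligned at least as strictly as Fred requires
  alignas(Fred) unsigned char raw[N * sizeof(Fred)];
  Fred* elems[N];          // Pointers returned by placement new
  std::size_t built = 0;   // How many elements are constructed so far

  try {
    for (; built < N; ++built)
      elems[built] = new (raw + built * sizeof(Fred)) Fred(5, 7);
  } catch (...) {
    // A constructor threw: destruct the elements already built, newest first
    while (built > 0) elems[--built]->~Fred();
    throw;
  }
  assert(Fred::live == static_cast<int>(N));

  int total = 0;
  for (std::size_t i = 0; i < N; ++i) total += elems[i]->sum();

  // Manual destruction, in reverse order of construction
  while (built > 0) elems[--built]->~Fred();
  return total;
}
```

Compare that with the one-line std::vector<Fred> a(10, Fred(5,7)); shown earlier.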
By the way, did I ever mention that arrays are evil? Or did I mention that you ought to use a std::vector unless there is a compelling reason to use an array?
Should my constructors use “initialization lists” or “assignment”?
Initialization lists. In fact, as a rule, constructors should initialize all member objects in the initialization list. One exception is discussed further down.
Watch this space for discussion of Non Static Data Member Initialization in C++11
// Here is the taste of standard C++ NSDMI
struct Point {
int X = 0; // Look at that!!!
int Y = 0; //
};
Consider the following constructor that initializes member object x_ using an initialization list: Fred::Fred() : x_(whatever) { }. The most common benefit of doing this is improved performance. For example, if the expression whatever is the same type as member variable x_, the result of the whatever expression is constructed directly inside x_; the compiler does not make a separate copy of the object. Even if the types are not the same, the compiler is usually able to do a better job with initialization lists than with assignments.
The other (inefficient) way to build constructors is via assignment, such as: Fred::Fred() { x_ = whatever; }. In this case the expression whatever causes a separate, temporary object to be created, and this temporary object is passed into the x_ object’s assignment operator. Then that temporary object is destructed at the ;. That’s inefficient.
As if that wasn’t bad enough, there’s another source of inefficiency when using assignment in a constructor: the member object will get fully constructed by its default constructor, and this might, for example, allocate some default amount of memory or open some default file. All this work could be for naught if the whatever expression and/or assignment operator causes the object to close that file and/or release that memory (e.g., if the default constructor didn’t allocate a large enough pool of memory or if it opened the wrong file).
Conclusion: All other things being equal, your code will run faster if you use initialization lists rather than assignment.
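To make the cost concrete, the sketch below counts how many operations each style performs on a member object. Member, ByInitList, ByAssignment and the ops counter are hypothetical names invented for this illustration.

```cpp
#include <cassert>

// Hypothetical member type that counts constructions and assignments.
class Member {
public:
  Member()                         { ++ops; }
  Member(int)                      { ++ops; }
  Member(const Member&)            { ++ops; }
  Member& operator=(const Member&) { ++ops; return *this; }
  static int ops;
};
int Member::ops = 0;

class ByInitList {
public:
  ByInitList() : x_(42) { }            // x_ is constructed once, directly
private:
  Member x_;
};

class ByAssignment {
public:
  ByAssignment() { x_ = Member(42); }  // Default-construct x_, build a temporary, assign
private:
  Member x_;
};

// Number of Member operations needed to construct one T.
template <typename T>
int opsToConstruct()
{
  Member::ops = 0;
  T object;
  (void)object;
  return Member::ops;
}

void check_counts()
{
  assert(opsToConstruct<ByInitList>() == 1);
  assert(opsToConstruct<ByAssignment>() == 3);
}
```

Here the operations are trivial, so the difference is only a count; with a member that allocates memory or opens files, each extra operation is real work.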
Note: There is no performance difference if the type of x_ is some built-in/intrinsic type, such as int or char* or float. But even in these cases, my personal preference is to set those data members in the initialization list rather than via assignment for consistency. Another symmetry argument in favor of using initialization lists even for built-in/intrinsic types: non-static const and non-static reference data members can’t be assigned a value in the constructor, so for symmetry it makes sense to initialize everything in the initialization list.
Now for the exceptions. Every rule has exceptions (hmmm; does “every rule has exceptions” have exceptions? reminds me of Gödel’s Incompleteness Theorems), and there are a couple of exceptions to the “use initialization lists” rule. Bottom line is to use common sense: if it’s cheaper, better, faster, etc. to not use them, then by all means, don’t use them. This might happen when your class has two constructors that need to initialize the this object’s data members in different orders. Or it might happen when two data members are self-referential. Or when a data-member needs a reference to the this object, and you want to avoid a compiler warning about using the this keyword prior to the { that begins the constructor’s body (when your particular compiler happens to issue that particular warning). Or when you need to do an if…throw test on a variable (parameter, global, etc.) prior to using that variable to initialize one of your this members. This list is not exhaustive; please don’t write me asking me to add another “Or when…”. The point is simply this: use common sense.
How should initializers be ordered in a constructor’s initialization list?
Immediate base classes (left to right), then member objects (top to bottom).
In other words, the order of the initialization list should mimic the order in which initializations will take place. This guideline discourages a particularly subtle class of order dependency errors by giving an obvious, visual clue. For example, the following contains a hideous error.
#include <iostream>
class Y {
public:
Y();
void f();
};
Y::Y() { std::cout << "Initializing Y\n"; }
void Y::f() { std::cout << "Using Y\n"; }
class X {
public:
X(Y& y);
};
X::X(Y& y) { y.f(); }
class Z {
public:
Z();
protected:
X x_;
Y y_;
};
Z::Z()
: y_()
, x_(y_) // Bad: should have listed x_ before y_
{ }
int main()
{
Z z;
return 0;
}
The output of this program follows.
Using Y
Initializing Y
Note that y_ is used (Y::f()) before it is initialized (Y::Y()). If instead the programmer had read and abided by the guideline in this FAQ, the error would be more obvious: the initialization list of Z::Z() would have read x_(y_), y_(), visually indicating that y_ was being used before being initialized.
Not all compilers issue diagnostic messages for these cases. You have been warned.
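Here is a small sketch showing that the order written in the initialization list has no effect on the actual order. The names Base, A, B and the initOrder log are hypothetical; the constructor deliberately lists its initializers backwards, which some compilers will warn about.

```cpp
#include <cassert>
#include <string>

std::string initOrder;  // Hypothetical log of the order in which things get initialized

struct Base { Base() { initOrder += "Base "; } };
struct A    { A()    { initOrder += "A "; } };
struct B    { B()    { initOrder += "B "; } };

class Z : public Base {
public:
  Z();
private:
  A a_;  // Declared first, so initialized first (after Base)
  B b_;  // Declared second, so initialized second
};

// Deliberately listed "backwards": the list order is ignored.
Z::Z() : b_(), a_(), Base() { }

std::string orderForZ()
{
  initOrder.clear();
  Z z;
  (void)z;
  return initOrder;
}

void check_order()
{
  assert(orderForZ() == "Base A B ");
}
```

Writing the list as Base(), a_(), b_() would behave identically, but it would also tell the reader the truth.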
Is it moral for one member object to be initialized using another member object in the initializer expression?
Yes, but use care and do that only when it adds value.
In a constructor’s initialization list, it is easiest and safest to avoid using one member object from this object in the initialization expression of a subsequent initializer for this object. This guideline prevents subtle order-dependency errors if someone reorganizes the layout of member objects within the class.
Because of this guideline, the constructor that follows uses s.len_ + 1u rather than len_ + 1u, even though they are otherwise equivalent. The s. prefix avoids an unnecessary and avoidable order dependency.
#include <cstring>
class MyString {
public:
MyString();
~MyString();
MyString(const MyString& s); // copy constructor
MyString& operator= (const MyString& s); // assignment
// ...
protected:
unsigned len_;
char* data_;
};
MyString::MyString()
: len_(0u)
, data_(new char[1])
{
data_[0] = '\0';
}
MyString::~MyString()
{ delete[] data_; }
MyString::MyString(const MyString& s)
: len_ (s.len_)
, data_(new char[s.len_ + 1u]) // not new char[len_ + 1u]
{
std::memcpy(data_, s.data_, len_ + 1u); // no issue using len_ in ctor's {body}
}
int main()
{
MyString a; // default ctor; zero length MyString ("")
MyString b = a; // copy constructor
return 0;
}
An unnecessary order dependency on the class layout of len_ and data_ would have been introduced if the constructor’s initialization of data_ had used len_ + 1u rather than s.len_ + 1u. However using len_ within a constructor body ({...}) is okay. No order dependency is introduced since the entire initialization list is guaranteed to finish before the constructor body begins executing.
What if one member object has to be initialized using another member object?
Comment the declaration of the affected data members with //ORDER DEPENDENCY.
If a constructor initializes a member object of this object using another member object of this object, rearranging the data members of the class could break the constructor. This important maintenance constraint should be documented in the class body.
For example, in the constructor below, the initializer for data_ uses len_ to avoid a redundant call to std::strlen(s), which introduces an order dependency in the class body.
#include <cstring>
class MyString {
public:
MyString(const char* s); // promote const char*
MyString(const MyString& s); // copy constructor
MyString& operator= (const MyString&); // assignment
~MyString();
// ...
protected:
unsigned len_; // ORDER DEPENDENCY
char* data_; // ORDER DEPENDENCY
};
MyString::MyString(const char* s)
: len_ (std::strlen(s))
, data_(new char[len_ + 1u])
{
std::memcpy(data_, s, len_ + 1u);
}
MyString::~MyString()
{
delete[] data_;
}
int main()
{
MyString s = "xyzzy";
return 0;
}
Note that the //ORDER DEPENDENCY comment is listed with the affected data members in the class body, not with the constructor initialization list where the order dependency was actually created. That is because the order of member objects in the class body is critical; the order of initializers in the constructor initialization list is irrelevant.
Should you use the this pointer in the constructor?
Some people feel you should not use the this pointer in a constructor because the object is not fully formed yet. However you can use this in the constructor (in the {body} and even in the initialization list) if you are careful.
Here is something that always works: the {body} of a constructor (or a function called from the constructor) can reliably access the data members declared in a base class and/or the data members declared in the constructor’s own class. This is because all those data members are guaranteed to have been fully constructed by the time the constructor’s {body} starts executing.
Here is something that never works: the {body} of a constructor (or a function called from the constructor) cannot get down to a derived class by calling a virtual member function that is overridden in the derived class. If your goal was to get to the overridden function in the derived class, you won’t get what you want. Note that you won’t get to the override in the derived class independent of how you call the virtual member function: explicitly using the this pointer (e.g., this->method()), implicitly using the this pointer (e.g., method()), or even calling some other function that calls the virtual member function on your this object. The bottom line is this: even if the caller is constructing an object of a derived class, during the constructor of the base class, your object is not yet of that derived class. You have been warned.
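The “never works” case is easy to demonstrate. In this sketch (the classes Base and Derived and the member nameDuringCtor_ are made up for illustration), the base-class constructor calls a virtual member function and records what it got:

```cpp
#include <cassert>
#include <string>

class Base {
public:
  Base() { nameDuringCtor_ = name(); }  // Virtual call from inside the constructor
  virtual ~Base() { }
  virtual std::string name() const { return "Base"; }
  std::string nameDuringCtor() const { return nameDuringCtor_; }
private:
  std::string nameDuringCtor_;
};

class Derived : public Base {
public:
  virtual std::string name() const { return "Derived"; }
};

void check_virtual_in_ctor()
{
  Derived d;
  assert(d.nameDuringCtor() == "Base");  // The override was NOT reached during Base::Base()
  assert(d.name() == "Derived");         // After construction, the override is used
}
```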
Here is something that sometimes works: if you pass any of the data members in this object to another data member’s initializer, you must make sure that the other data member has already been initialized. The good news is that you can determine whether the other data member has (or has not) been initialized using some straightforward language rules that are independent of the particular compiler you’re using. The bad news is that you have to know those language rules (e.g., base class sub-objects are initialized first (look up the order if you have multiple and/or virtual inheritance!), then data members defined in the class are initialized in the order in which they appear in the class declaration). If you don’t know these rules, then don’t pass any data member from the this object (regardless of whether or not you explicitly use the this keyword) to any other data member’s initializer! And if you do know the rules, please be careful.
What is the “Named Constructor Idiom”?
A technique that provides more intuitive and/or safer construction operations for users of your class.
The problem is that constructors always have the same name as the class. Therefore the only way to differentiate between the various constructors of a class is by the parameter list. But if there are lots of constructors, the differences between them become somewhat subtle and error prone.
With the Named Constructor Idiom, you declare all the class’s constructors in the private or protected sections, and you provide public static methods that return an object. These static methods are the so-called “Named Constructors.” In general there is one such static method for each different way to construct an object.
For example, suppose we are building a Point class that represents a position on the X-Y plane. Turns out there are two common ways to specify a 2-space coordinate: rectangular coordinates (X+Y) and polar coordinates (Radius+Angle). (Don’t worry if you can’t remember these; the point isn’t the particulars of coordinate systems; the point is that there are several ways to create a Point object.) Unfortunately the parameters for these two coordinate systems are the same: two floats. The two constructors would therefore have identical signatures, which cannot be overloaded, so the compiler rejects the class:
class Point {
public:
Point(float x, float y); // Rectangular coordinates
Point(float r, float a); // Polar coordinates (radius and angle)
// ERROR: both constructors have the same signature, Point::Point(float,float)
};
int main()
{
Point p = Point(5.7, 1.2); // Ambiguous: Which coordinate system?
// ...
}
One way to solve this ambiguity is to use the Named Constructor Idiom:
#include <cmath> // To get std::sin() and std::cos()
class Point {
public:
static Point rectangular(float x, float y); // Rectangular coord's
static Point polar(float radius, float angle); // Polar coordinates
// These static methods are the so-called "named constructors"
// ...
private:
Point(float x, float y); // Rectangular coordinates
float x_, y_;
};
inline Point::Point(float x, float y)
: x_(x), y_(y) { }
inline Point Point::rectangular(float x, float y)
{ return Point(x, y); }
inline Point Point::polar(float radius, float angle)
{ return Point(radius*std::cos(angle), radius*std::sin(angle)); }
Now the users of Point have a clear and unambiguous syntax for creating Points in either coordinate system:
int main()
{
Point p1 = Point::rectangular(5.7, 1.2); // Obviously rectangular
Point p2 = Point::polar(5.7, 1.2); // Obviously polar
// ...
}
Make sure your constructors are in the protected section if you expect Point to have derived classes.
The Named Constructor Idiom can also be used to make sure your objects are always created via new.
Note that the Named Constructor Idiom, at least as implemented above, is just as fast as directly calling a constructor — modern compilers will not make any extra copies of your object.
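As a rough illustration of using the idiom to force creation via new, here is one way it might look. The class and its create() and value() members are hypothetical, and the caller is assumed to be responsible for the eventual delete.

```cpp
#include <cassert>

class Fred {
public:
  // Named constructor: the only way for users to create a Fred
  static Fred* create(int i) { return new Fred(i); }
  int value() const { return i_; }
private:
  explicit Fred(int i) : i_(i) { }  // Private: users cannot make local or static Freds
  Fred(const Fred&);                // Declared but not defined: prevents copies
  Fred& operator=(const Fred&);     // Declared but not defined: prevents assignment
  int i_;
};

void check_create()
{
  Fred* p = Fred::create(5);  // OK: goes through new
  assert(p->value() == 5);
  delete p;                   // The caller owns the object
  // Fred local(5);           // Would not compile: the constructor is private
}
```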
Does return-by-value mean extra copies and extra overhead?
Not necessarily.
All(?) commercial-grade compilers optimize away the extra copy, at least in cases as illustrated in the previous FAQ.
To keep the example clean, let’s strip things down to the bare essentials. Suppose function caller() calls rbv() (“rbv” stands for “return by value”) which returns a Foo object by value:
class Foo { /*...*/ };
Foo rbv();
void caller()
{
Foo x = rbv(); // The return-value of rbv() goes into x
// ...
}
Now the question is, How many Foo objects will there be? Will rbv() create a temporary Foo object that gets copy-constructed into x? How many temporaries? Said another way, does return-by-value necessarily degrade performance?
The point of this FAQ is that the answer is No, commercial-grade C++ compilers implement return-by-value in a way that lets them eliminate the overhead, at least in simple cases like those shown in the previous FAQ. In particular, all(?) commercial-grade C++ compilers will optimize this case:
Foo rbv()
{
// ...
return Foo(42, 73); // Suppose Foo has a ctor Foo::Foo(int a, int b)
}
Certainly the compiler is allowed to create a temporary, local Foo object, then copy-construct that temporary into variable x within caller(), then destruct the temporary. But all(?) commercial-grade C++ compilers won’t do that: the return statement will directly construct x itself. Not a copy of x, not a pointer to x, not a reference to x, but x itself.
You can stop here if you don’t want to genuinely understand the previous paragraph, but if you want to know the secret sauce (so you can, for example, reliably predict when the compiler can and cannot provide that optimization for you), the key is to know that compilers usually implement return-by-value using pass-by-pointer. When caller() calls rbv(), the compiler secretly passes a pointer to the location where rbv() is supposed to construct the “returned” object. It might look something like this (it’s shown as a void* rather than a Foo* since the Foo object has not yet been constructed):
// Pseudo-code
void rbv(void* put_result_here) // Original C++ code: Foo rbv()
{
// ...code that initializes (not assigns to) the variable pointed to by put_result_here
}
// Pseudo-code
void caller()
{
// Original C++ code: Foo x = rbv()
struct Foo x; // Note: x does not get initialized prior to calling rbv()
rbv(&x); // Note: rbv() initializes a local variable defined in caller()
// ...
}
So the first ingredient in the secret sauce is that the compiler (usually) transforms return-by-value into pass-by-pointer. This means that commercial-grade compilers don’t bother creating a temporary: they directly construct the returned object in the location pointed to by put_result_here.
The second ingredient in the secret sauce is that compilers typically implement constructors using a similar technique. This is compiler-dependent and somewhat idealized (I’m intentionally ignoring how to handle new and overloading), but compilers typically implement Foo::Foo(int a, int b) using something like this:
// Pseudo-code
void Foo_ctor(Foo* this, int a, int b) // Original C++ code: Foo::Foo(int a, int b)
{
// ...
}
Putting these together, the compiler might implement the return statement in rbv() by simply passing put_result_here as the constructor’s this pointer:
// Pseudo-code
void rbv(void* put_result_here) // Original C++ code: Foo rbv()
{
// ...
Foo_ctor((Foo*)put_result_here, 42, 73); // Original C++ code: return Foo(42,73);
return;
}
So caller() passes &x to rbv(), and rbv() in turn passes &x to the constructor (as the this pointer). That means the constructor directly constructs x.
In the early 90s I did a seminar for IBM’s compiler group in Toronto, and one of their engineers told me that they found this return-by-value optimization to be so fast that you get it even if you don’t compile with optimization turned on. Because the return-by-value optimization causes the compiler to generate less code, it actually improves compile-times in addition to making your generated code smaller and faster. The point is that the return-by-value optimization is almost universally implemented, at least in code cases like those shown above.
Final thought: this discussion was limited to whether there will be any extra copies of the returned object in a return-by-value call. Don’t confuse that with other things that could happen in caller(). For example, if you changed caller() from Foo x = rbv(); to Foo x; x = rbv(); (note the ; after the declaration), the compiler is required to use Foo’s assignment operator, and unless the compiler can prove that Foo’s default constructor followed by assignment operator is exactly the same as its copy constructor, the compiler is required by the language to put the returned object into an unnamed temporary within caller(), use the assignment operator to copy the temporary into x, then destruct the temporary. The return-by-value optimization still plays its part since there will be only one temporary, but by changing Foo x = rbv(); to Foo x; x = rbv();, you have prevented the compiler from eliminating that last temporary.
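If you would rather measure than trust, a sketch like the following counts copy-constructor and assignment-operator calls for the two forms of caller() discussed above. The counters and helper functions are hypothetical additions to Foo; on compilers that perform the optimization (and on any C++17 compiler, where it is guaranteed for this form of return), the initialization form makes no copies.

```cpp
#include <cassert>

class Foo {
public:
  Foo()                      { }
  Foo(int a, int b)          { (void)a; (void)b; }
  Foo(const Foo&)            { ++copies; }
  Foo& operator=(const Foo&) { ++assignments; return *this; }
  static int copies;       // Hypothetical counter of copy-constructor calls
  static int assignments;  // Hypothetical counter of assignment-operator calls
};
int Foo::copies = 0;
int Foo::assignments = 0;

Foo rbv() { return Foo(42, 73); }

// The form from the text: Foo x = rbv();
int copiesWhenInitializing()
{
  Foo::copies = 0;
  Foo x = rbv();
  (void)x;
  return Foo::copies;
}

// The changed form: Foo x; x = rbv();
int assignmentsWhenAssigning()
{
  Foo::assignments = 0;
  Foo x;
  x = rbv();  // The assignment operator must run here
  return Foo::assignments;
}

void check_rbv()
{
  assert(copiesWhenInitializing() == 0);    // When the copy is optimized away
  assert(assignmentsWhenAssigning() == 1);  // Always one assignment
}
```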
What about returning a local variable by value? Does the local exist as a separate object, or does it get optimized away?
When your code returns a local variable by value, your compiler might optimize away the local variable completely - zero space-cost and zero time-cost - the local variable never actually exists as a distinct object from the caller’s target variable (see below for specifics about exactly what this means). Other compilers do not optimize it away.
These are some(!) of the compilers that optimize away the local variable completely:
- GNU C++ (g++) since at least version 3.3.3
- (Others need to be added; need more info)
These are some(!) of the compilers that do not optimize away the local variable:
- Microsoft Visual C++.NET 2003
- (Others need to be added; need more info)
Here is an example showing what we mean in this FAQ:
class Foo {
public:
Foo(int a, int b);
void some_method();
// ...
};
void do_something_with(Foo& z);
Foo rbv()
{
Foo y = Foo(42, 73);
y.some_method();
do_something_with(y);
return y;
}
void caller()
{
Foo x = rbv();
// ...
}
The question addressed in this FAQ is this: How many Foo objects actually get created in the runtime system? Conceptually there could be as many as three distinct objects: the temporary created by Foo(42, 73), variable y (in rbv()), and variable x (in caller()). However as we saw earlier most compilers merge Foo(42, 73) and variable y into the same object, reducing the total number of objects from 3 to 2. But this FAQ pushes it one step further: does y (in rbv()) show up as a distinct, runtime object from x (in caller())?
Some compilers, including but not limited to those listed above, completely optimize away local variable y. In those compilers, there is only one Foo object in the above code: caller()’s variable x is exactly identically the same object as rbv()’s variable y.
They do this the same way as described earlier: the return-by-value in function rbv() is implemented as pass-by-pointer, where the pointer points to the location where the returned object is to be initialized.
So instead of constructing y as a local object, these compilers simply construct *put_result_here, and every time they see variable y used in the original source code, they substitute *put_result_here instead. Then the line return y; becomes simply return; since the returned object has already been constructed in the location designated by the caller.
Here is the resulting (pseudo)code:
// Pseudo-code
void rbv(void* put_result_here) // Original C++ code: Foo rbv()
{
Foo_ctor((Foo*)put_result_here, 42, 73); // Original C++ code: Foo y = Foo(42,73);
Foo_some_method((Foo*)put_result_here); // Original C++ code: y.some_method(); (passes the this pointer)
do_something_with(*(Foo*)put_result_here); // Original C++ code: do_something_with(y); (takes a Foo&)
return; // Original C++ code: return y;
}
void caller()
{
struct Foo x; // Note: x is not initialized here!
rbv(&x); // Original C++ code: Foo x = rbv();
// ...
}
Caveat: this optimization can be applied only when all a function’s return statements return the same local variable. If one return statement in rbv() returned local variable y but another returned something else, such as a global or a temporary, the compiler could not alias the local variable into the caller’s destination, x. Verifying that all the function’s return statements return the same local variable requires extra work on the part of the compiler writers, which is usually why some compilers fail to implement that return-local-by-value optimization.
Final thought: this discussion was limited to whether there will be any extra copies of the returned object in a return-by-value call. Don’t confuse that with other things that could happen in caller(). For example, if you changed caller() from Foo x = rbv(); to Foo x; x = rbv(); (note the ; after the declaration), the compiler is required to use Foo’s assignment operator, and unless the compiler can prove that Foo’s default constructor followed by assignment operator is exactly the same as its copy constructor, the compiler is required by the language to put the returned object into an unnamed temporary within caller(), use the assignment operator to copy the temporary into x, then destruct the temporary. The return-by-value optimization still plays its part since there will be only one temporary, but by changing Foo x = rbv(); to Foo x; x = rbv();, you have prevented the compiler from eliminating that last temporary.
Why can’t I initialize my static member data in my constructor’s initialization list?
Because you must explicitly define your class’s static data members.
Fred.h:
class Fred {
public:
Fred();
// ...
private:
int i_;
static int j_;
};
Fred.cpp (or Fred.C or whatever):
Fred::Fred()
: i_(10) // Okay: you can (and should) initialize member data this way
, j_(42) // Error: you cannot initialize static member data like this
{
// ...
}
// You must define static data members this way:
int Fred::j_ = 42;
Note: in some cases, the definition of Fred::j_ might not contain the = initializer part. For details, see the next two FAQs.
Why are classes with static data members getting linker errors?
Because static data members must be explicitly defined in exactly one compilation unit. If you didn’t do this, you’ll probably get an "undefined external" linker error. For example:
// Fred.h
class Fred {
public:
// ...
private:
static int j_; // Declares static data member Fred::j_
// ...
};
The linker will holler at you ("Fred::j_ is not defined") unless you define (as opposed to merely declare) Fred::j_ in (exactly) one of your source files:
// Fred.cpp
#include "Fred.h"
int Fred::j_ = some_expression_evaluating_to_an_int;
// Alternatively, if you wish to use the implicit 0 value for static ints:
// int Fred::j_;
The usual place to define static data members of class Fred is file Fred.cpp (or Fred.C or whatever source file extension you use).
Note: in some cases, you can add = initializer; to the declaration of a class-scope static data member; however, if you ever use the data member, you still need to explicitly define it in exactly one compilation unit. In this case you don’t include an = initializer in the definition. A separate FAQ covers this topic.
Can I add = initializer; to the declaration of a class-scope static const data member?
Yes, though with some important caveats.
Before going through the caveats, here is a simple example that is allowed:
// Fred.h
class Fred {
public:
static const int maximum = 42;
// ...
};
And, just like other static data members, it must be defined in exactly one compilation unit, though this time without the = initializer part:
// Fred.cpp
#include "Fred.h"
const int Fred::maximum;
// ...
The caveats are that you may do this only with integral or enumeration types, and that the initializer expression must be an expression that can be evaluated at compile-time: it must only contain other constants, possibly combined with built-in operators. For example, 3*4 is a compile-time constant expression, as is a*b provided a and b are compile-time constants. After the declaration above, Fred::maximum is also a compile-time constant: it can be used in other compile-time constant expressions.
If you ever take the address of Fred::maximum
, such as passing it by reference or explicitly saying &Fred::maximum
,
the compiler will make sure it has a unique address. If not, Fred::maximum
won’t even take up space in your process’s
static data area.
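As a rough illustration of the two uses described above, the sketch below uses Fred::maximum first as a compile-time constant (an array bound) and then through its address. The names table and useFred() are invented for this example; they are not part of the FAQ's Fred class.

```cpp
#include <cassert>

class Fred {
public:
    static const int maximum = 42;   // declaration with an in-class initializer
};

// The one definition (normally in Fred.cpp); note: no initializer here.
const int Fred::maximum;

// Used as a compile-time constant: an array bound must be one.
int table[Fred::maximum];

void useFred()
{
    const int* p = &Fred::maximum;   // taking the address relies on the definition above
    assert(*p == 42);
    assert(sizeof(table) / sizeof(table[0]) == 42);
}
```

If the address is never taken and the member is only used for its value, the definition line is typically not needed by the linker, which matches the "won't even take up space" remark above.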
What’s the “static
initialization order ‘fiasco’ (problem)”?
A subtle way to crash your program.
The static
initialization order problem is a very subtle and commonly misunderstood aspect of C++. Unfortunately
it’s very hard to detect — the errors often occur before main()
begins.
In short, suppose you have two static
objects x
and y
which exist in separate source files, say x.cpp
and
y.cpp
. Suppose further that the initialization for the y
object (typically the y
object’s constructor) calls some
method on the x
object.
That’s it. It’s that simple.
The tough part is that you have a 50%-50% chance of corrupting the program. If the compilation unit for x.cpp
happens to get initialized
first, all is well. But if the compilation unit for y.cpp
gets initialized first, then y
’s initialization will get
run before x
’s initialization, and you’re toast. E.g., y
’s constructor could call a method on the x
object, yet
the x
object hasn’t yet been constructed.
For how to address the problem, see the next FAQ.
Note: The static initialization order problem can also, in some cases, apply to built-in/intrinsic types.
How do I prevent the “static
initialization order problem”?
To prevent the static initialization order problem, use the Construct On First Use Idiom, described below.
The basic idea of the Construct On First Use Idiom is to wrap your static
object inside a function. For example,
suppose you have two classes, Fred
and Barney
. There is a namespace-scope / global Fred
object called x
, and a
namespace-scope / global Barney
object called y
. Barney
’s constructor invokes the goBowling()
method on the x
object. The file x.cpp
defines the x
object:
// File x.cpp
#include "Fred.h"
Fred x;
The file y.cpp
defines the y
object:
// File y.cpp
#include "Barney.h"
Barney y;
For completeness the Barney
constructor might look something like this:
// File Barney.cpp
#include "Barney.h"
Barney::Barney()
{
// ...
x.goBowling();
// ...
}
You would have a static
initialization disaster if y
got constructed before x
. As written
above, this disaster would occur roughly 50% of the time, since the two objects are declared in different source files
and those source files give no hints to the compiler or linker as to the order of static initialization.
There are many solutions to this problem, but a very simple and completely portable solution is the Construct On First
Use Idiom: replace the namespace-scope / global Fred
object x
with a namespace-scope / global function x()
that returns the Fred
object by reference.
// File x.cpp
#include "Fred.h"
Fred& x()
{
static Fred* ans = new Fred();
return *ans;
}
Since static
local objects are constructed the first time control flows over their declaration (only), the above
new Fred()
statement will only happen once: the first time x()
is called. Every subsequent call will return the
same Fred
object (the one pointed to by ans
). Then all you do is change your usages of x
to x()
:
// File Barney.cpp
#include "Barney.h"
Barney::Barney()
{
// ...
x().goBowling();
// ...
}
This is called the Construct On First Use Idiom because it does just that: the (logically namespace-scope / global)
Fred
object is constructed on its first use.
The downside of this approach is that the Fred
object is never destructed. If the Fred
object has a destructor with
important side effects, there is another technique that answers this concern; but it needs
to be used with care since it creates the possibility of another (equally nasty) problem.
Note: The static initialization order problem can also, in some cases, apply to built-in/intrinsic types.
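To see the idiom at work in a single file, here is a minimal sketch. The counter Fred::constructed and the value returned by goBowling() are assumptions added purely so the behavior can be checked; the FAQ's Fred class does not specify them.

```cpp
#include <cassert>

class Fred {
public:
    Fred() { ++constructed; }
    int goBowling() { return 300; }     // hypothetical return value
    static int constructed;             // hypothetical counter, for illustration only
};
int Fred::constructed = 0;              // constant initializer, so it is set before any constructor runs

Fred& x()
{
    static Fred* ans = new Fred();      // runs only the first time control passes here
    return *ans;
}

void demoConstructOnFirstUse()
{
    Fred& first  = x();
    Fred& second = x();
    assert(&first == &second);          // the same object every time
    assert(Fred::constructed == 1);     // constructed exactly once
    assert(first.goBowling() == 300);
}
```

Since C++11 the initialization of a function-local static is also guaranteed to be thread-safe, so two threads calling x() for the first time at once will not construct two Fred objects.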
Why doesn’t the Construct On First Use Idiom use a static
object instead of a static
pointer?
Short answer: it’s possible to use a static object rather than a static pointer, but doing so opens up another (equally subtle, equally nasty) problem.
Long answer: sometimes people worry about the fact that the previous solution
“leaks.” In many cases, this is not a problem, but it is a problem in some cases. Note: even though the object pointed
to by ans
in the previous FAQ is never deleted, the memory doesn’t actually “leak” when the program exits since the
operating system automatically reclaims all the memory in a program’s heap when that program exits. In other words, the
only time you’d need to worry about this is when the destructor for the Fred
object performs some important action
(such as writing something to a file) that must occur sometime while the program is exiting.
In those cases where the construct-on-first-use object (the Fred
, in this case) needs to eventually get
destructed, you might consider changing function x()
as follows:
// File x.cpp
#include "Fred.h"
Fred& x()
{
static Fred ans; // was static Fred* ans = new Fred();
return ans; // was return *ans;
}
However there is (or rather, may be) a rather subtle problem with this change. To understand this potential problem, let’s remember why we’re doing all this in the first place: we need to make 100% sure our static object (a) gets constructed prior to its first use and (b) doesn’t get destructed until after its last use. Obviously it would be a disaster if any static object got used either before construction or after destruction. The message here is that you need to worry about two situations (static initialization and static deinitialization), not just one.
By changing the declaration from static Fred* ans = new Fred();
to static Fred ans;
, we still correctly handle the
initialization situation but we no longer handle the deinitialization situation. For example, if there are 3 static
objects, say a
, b
and c
, that use ans
during their destructors, the only way to avoid a static deinitialization
disaster is if ans
is destructed after all three.
The point is simple: if there are any other static objects whose destructors might use ans
after ans
is destructed,
bang, you’re dead. If the constructors of a
, b
and c
use ans
, you should normally be okay since the runtime
system will, during static deinitialization, destruct ans
after the last of those three objects is destructed. However
if a
and/or b
and/or c
fail to use ans
in their constructors and/or if any code anywhere gets the address of
ans
and hands it to some other static object, all bets are off and you have to be very, very careful.
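The rule behind the "you should normally be okay" case is that objects with static storage duration are destructed in the reverse order in which their constructors completed. The sketch below tries to make that visible. The class User, the object a, and the counters ansOrder and aOrder are hypothetical names added for illustration.

```cpp
#include <cassert>
#include <cstdio>

// Plain ints with constant initializers: they are set before any
// constructor runs, so they can safely record the construction order.
int counter  = 0;
int ansOrder = 0;
int aOrder   = 0;

struct Fred {
    Fred()  { ansOrder = ++counter; }
    ~Fred() { std::puts("ans destructed"); }
    void goBowling() { }
};

Fred& x()
{
    static Fred ans;    // static object rather than static pointer
    return ans;
}

struct User {
    User()  { x().goBowling(); aOrder = ++counter; }         // uses ans during construction
    ~User() { x().goBowling(); std::puts("a destructed"); }  // safe: ans outlives a
};

User a;    // namespace-scope static object

void checkConstructionOrder()
{
    assert(ansOrder == 1);   // ans finished constructing first...
    assert(aOrder == 2);     // ...so a is destructed before ans at exit
}
```

Because a's constructor calls x(), ans finishes construction before a does, and at program exit a's destructor runs first. Remove the x() call from User's constructor and that guarantee disappears, which is exactly the danger described above.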
There is a third approach that handles both the static initialization and static deinitialization situations, but it has other non-trivial costs.
What is a technique to guarantee both static
initialization and static
deinitialization?
Short answer: use the Nifty Counter Idiom (but make sure you understand the non-trivial tradeoffs!).
Motivation:
- The Construct On First Use Idiom uses a pointer and intentionally leaks the object. That is often innocuous, since the operating system will typically clean up a process’s memory when the process terminates. However if the object has a non-trivial destructor with important side effects, such as writing to a file or some other non-volatile action, then you need more.
- That’s where the second version of the Construct On First Use Idiom came in: it doesn’t leak the object, but it does not control the order of static deinitialization, so it is (very!) unsafe to use the object during static deinitialization, that is, from a destructor of another statically declared object.
- If you need to control the order of both static initialization and static deinitialization, meaning if you wish to access a statically allocated object from both constructors and destructors of other static objects, then keep reading.
- Otherwise run away.
TODO: WRITE THIS UP
TODO: WRITE UP TRADEOFFS — now that you know how to use the Nifty Counter Idiom, be sure you understand both when and (especially!) when not to use it! One size does not fit all.
How do I prevent the “static
initialization order problem” for my static
data members?
Use the Construct Members On First Use Idiom, which is basically the same as the regular Construct On First Use
Idiom, or perhaps one of its variants, but it uses a
static
member function instead of a namespace-scope / global function.
Suppose you have a class X
that has a static
Fred
object:
// File X.h
class X {
public:
// ...
private:
static Fred x_;
};
Naturally this static
member is initialized separately:
// File X.cpp
#include "X.h"
Fred X::x_;
Naturally also the Fred
object will be used in one or more of X
’s methods:
void X::someMethod()
{
x_.goBowling();
}
But now the “disaster scenario” is if someone somewhere somehow calls this method before the Fred
object gets
constructed. For example, if someone else creates a static X
object and invokes its someMethod()
method during
static
initialization, then you’re at the mercy of the compiler as to whether the compiler will construct X::x_
before or after the someMethod()
is called. (Note that the ANSI/ISO C++ committee is working on this problem, but
compilers aren’t yet generally available that handle these changes; watch this space for an update in the future.)
In any event, it’s always portable and safe to change the X::x_
static
data member into a static
member function:
// File X.h
class X {
public:
// ...
private:
static Fred& x();
};
Naturally this static
member function is defined separately:
// File X.cpp
#include "X.h"
Fred& X::x()
{
static Fred* ans = new Fred();
return *ans;
}
Then you simply change any usages of x_
to x()
:
void X::someMethod()
{
x().goBowling();
}
If you’re super performance sensitive and you’re concerned about the overhead of an extra function call on each
invocation of X::someMethod()
you can set up a static
Fred&
instead. As you recall, static
locals are only
initialized once (the first time control flows over their declaration), so this will call X::x()
only once: the first
time X::someMethod()
is called:
void X::someMethod()
{
static Fred& x = X::x();
x.goBowling();
}
Note: The static initialization order problem can also, in some cases, apply to built-in/intrinsic types.
Do I need to worry about the “static
initialization order problem” for variables of built-in/intrinsic types?
Yes.
If you initialize a variable of built-in/intrinsic type using a function call, the static initialization order problem can kill you just as badly as with user-defined/class types. For example, the following code shows the failure:
#include <iostream>
int f(); // forward declaration
int g(); // forward declaration
int x = f();
int y = g();
int f()
{
std::cout << "using 'y' (which is " << y << ")\n";
return 3*y + 7;
}
int g()
{
std::cout << "initializing 'y'\n";
return 5;
}
The output of this little program will show that it uses y
before initializing it. The solution, as before, is the
Construct On First Use Idiom:
#include <iostream>
int f(); // forward declaration
int g(); // forward declaration
int& x()
{
static int ans = f();
return ans;
}
int& y()
{
static int ans = g();
return ans;
}
int f()
{
std::cout << "using 'y' (which is " << y() << ")\n";
return 3*y() + 7;
}
int g()
{
std::cout << "initializing 'y'\n";
return 5;
}
Of course you might be able to simplify this by moving the initialization code for x
and y
into their respective
functions:
#include <iostream>
int& y(); // forward declaration
int& x()
{
static int ans;
static bool firstTime = true;
if (firstTime) {
firstTime = false;
std::cout << "using 'y' (which is " << y() << ")\n";
ans = 3*y() + 7;
}
return ans;
}
int& y()
{
static int ans;
static bool firstTime = true;
if (firstTime) {
firstTime = false;
std::cout << "initializing 'y'\n";
ans = 5;
}
return ans;
}
And, if you can get rid of the print statements you can further simplify these to something really simple:
int& y(); // forward declaration
int& x()
{
static int ans = 3*y() + 7;
return ans;
}
int& y()
{
static int ans = 5;
return ans;
}
Furthermore, since y
is initialized using a constant expression, it no longer needs its wrapper function — it can be
a simple variable again.
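That last simplification might look like the sketch below. The function demoSimplified() is a hypothetical helper added only to check the result.

```cpp
#include <cassert>

int y = 5;      // constant initializer: set before any dynamic initialization runs

int& x()
{
    static int ans = 3*y + 7;   // x still needs its wrapper: its initializer reads y
    return ans;
}

void demoSimplified()
{
    assert(x() == 22);          // 3*5 + 7
}
```

Only y can drop its wrapper here. The initializer of x is not a constant expression (it reads a non-const variable), so x keeps the Construct On First Use form.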
How can I handle a constructor that fails?
Throw an exception. For details, see here.
What is the “Named Parameter Idiom”?
It’s a fairly useful way to exploit method chaining.
The fundamental problem solved by the Named Parameter Idiom is that C++ only supports positional parameters. For
example, a caller of a function isn’t allowed to say, “Here’s the value for formal parameter xyz
, and this other thing
is the value for formal parameter pqr
.” All you can do in C++ (and C and Java) is say, “Here’s the first parameter,
here’s the second parameter, etc.” The alternative, called named parameters and implemented in the language Ada, is
especially useful if a function takes a large number of mostly default-able parameters.
Over the years people have cooked up lots of workarounds for the lack of named parameters in C and C++. One of these
involves burying the parameter values in a string parameter then parsing this string at run-time. This is what’s done
in the second parameter of fopen()
, for example. Another workaround is to combine all the boolean parameters in a
bit-map, then the caller or’s a bunch of bit-shifted constants together to produce the actual parameter. This is
what’s done in the second parameter of open()
, for example. These approaches work, but the following technique
produces caller-code that’s more obvious, easier to write, easier to read, and is generally more elegant.
The idea, called the Named Parameter Idiom, is to change the function’s parameters to methods of a newly created class,
where all these methods return *this
by reference. Then you simply rename the main function into a parameterless
“do-it” method on that class.
We’ll work an example to make the previous paragraph easier to understand.
The example will be for the “open a file” concept. Let’s say that concept logically requires a parameter for the file’s name, and optionally allows parameters for whether the file should be opened read-only vs. read-write vs. write-only, whether or not the file should be created if it doesn’t already exist, whether the writing location should be at the end (“append”) or the beginning (“overwrite”), the block-size if the file is to be created, whether the I/O is buffered or non-buffered, the buffer-size, whether it is to be shared vs. exclusive access, and probably a few others. If we implemented this concept using a normal function with positional parameters, the caller code would be very difficult to read: there’d be as many as 8 positional parameters, and the caller would probably make a lot of mistakes. So instead we use the Named Parameter Idiom.
Before we go through the implementation, here’s what the caller code might look like, assuming you are willing to accept all the function’s default parameters:
File f = OpenFile("foo.txt");
That’s the easy case. Now here’s what it might look like if you want to change a bunch of the parameters.
File f = OpenFile("foo.txt")
.readonly()
.createIfNotExist()
.appendWhenWriting()
.blockSize(1024)
.unbuffered()
.exclusiveAccess();
Notice how the “parameters”, if it’s fair to call them that, are in random order (they’re not positional) and they all have names. So the programmer doesn’t have to remember the order of the parameters, and the names are (hopefully) obvious.
So here’s how to implement it: first we create a class (OpenFile
) that houses all the parameter values as private
data members. The required parameters (in this case, the only required parameter is the file’s name) are implemented as a
normal, positional parameter on OpenFile
’s constructor, but that constructor doesn’t actually open the file. Then all
the optional parameters (readonly vs. readwrite, etc.) become methods. These methods (e.g., readonly()
,
blockSize(unsigned)
, etc.) return a reference to their this
object so the method calls can be
chained.
#include <string>
class File;
class OpenFile {
public:
OpenFile(const std::string& filename);
// sets all the default values for each data member
OpenFile& readonly(); // changes readonly_ to true
OpenFile& readwrite(); // changes readonly_ to false
OpenFile& createIfNotExist();
OpenFile& blockSize(unsigned nbytes);
// ...
private:
friend class File;
std::string filename_;
bool readonly_; // defaults to false [for example]
bool createIfNotExist_; // defaults to false [for example]
// ...
unsigned blockSize_; // defaults to 4096 [for example]
// ...
};
inline OpenFile::OpenFile(const std::string& filename)
: filename_ (filename)
, readonly_ (false)
, createIfNotExist_ (false)
, blockSize_ (4096u)
{ }
inline OpenFile& OpenFile::readonly()
{ readonly_ = true; return *this; }
inline OpenFile& OpenFile::readwrite()
{ readonly_ = false; return *this; }
inline OpenFile& OpenFile::createIfNotExist()
{ createIfNotExist_ = true; return *this; }
inline OpenFile& OpenFile::blockSize(unsigned nbytes)
{ blockSize_ = nbytes; return *this; }
The only other thing to do is make the constructor for class File
take an OpenFile
object:
class File {
public:
File(const OpenFile& params);
// ...
};
This constructor gets the actual parameters from the OpenFile object, then actually opens the file:
File::File(const OpenFile& params)
{
// ...
}
Note that OpenFile
declares File
as its friend
, that way OpenFile
doesn’t need a bunch of (otherwise
useless) public:
get methods.
Since each member function in the chain returns a reference, there is no copying of objects and the chain is highly
efficient. Furthermore, if the various member functions are inline
, the generated object code will probably be on par
with C-style code that sets various members of a struct
. Of course if the member functions are not inline
, there
may be a slight increase in code size and a slight decrease in performance (but only if the construction occurs on the
critical path of a CPU-bound program; this is a can of worms I’ll try to avoid opening), so it may, in this case, be a
tradeoff for making the code more reliable.
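Putting the pieces together, here is a compact, self-contained sketch of the idiom. It is only an approximation: this File does not open anything, it merely copies the settings, and the accessors name(), isReadonly(), createsIfMissing() and blockSize() on File are hypothetical additions so the result can be inspected.

```cpp
#include <cassert>
#include <string>

class File;

class OpenFile {
public:
    OpenFile(const std::string& filename)
        : filename_(filename), readonly_(false), createIfNotExist_(false), blockSize_(4096u) { }
    OpenFile& readonly()                 { readonly_ = true;         return *this; }
    OpenFile& readwrite()                { readonly_ = false;        return *this; }
    OpenFile& createIfNotExist()         { createIfNotExist_ = true; return *this; }
    OpenFile& blockSize(unsigned nbytes) { blockSize_ = nbytes;      return *this; }
private:
    friend class File;
    std::string filename_;
    bool        readonly_;
    bool        createIfNotExist_;
    unsigned    blockSize_;
};

class File {
public:
    File(const OpenFile& params)    // a real class would open the file here
        : name_(params.filename_), readonly_(params.readonly_),
          create_(params.createIfNotExist_), blockSize_(params.blockSize_) { }
    // Hypothetical accessors, present only so the example can be checked:
    const std::string& name() const   { return name_; }
    bool     isReadonly() const       { return readonly_; }
    bool     createsIfMissing() const { return create_; }
    unsigned blockSize() const        { return blockSize_; }
private:
    std::string name_;
    bool        readonly_;
    bool        create_;
    unsigned    blockSize_;
};

void demoNamedParameters()
{
    File plain = OpenFile("foo.txt");                               // all defaults
    File tuned = OpenFile("foo.txt").blockSize(1024).readonly();    // any order
    assert(!plain.isReadonly() && plain.blockSize() == 4096u);
    assert(tuned.isReadonly()  && tuned.blockSize() == 1024u);
    assert(tuned.name() == "foo.txt");
}
```

Note that the File constructor is deliberately not explicit in this sketch; that is what lets the caller write File f = OpenFile(...) as in the caller code shown earlier.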
Why am I getting an error after declaring a Foo
object via Foo x(Bar())
?
Because that doesn’t create a Foo
object - it declares a non-member function that returns a Foo
object. The term “Most Vexing Parse” was coined by Scott Meyers to describe this situation.
This is really going to hurt; you might want to sit down.
First, here’s a better explanation of the problem. Suppose there is a class called Bar
that has a default ctor. This
might even be a library class such as std::string
, but for now we’ll just call it Bar
:
class Bar {
public:
Bar();
// ...
};
Now suppose there’s another class called Foo
that has a ctor that takes a Bar
. As before, this might be defined by someone other than you.
class Foo {
public:
Foo(const Bar& b); // or perhaps Foo(Bar b)
// ...
void blah();
// ...
};
Now you want to create a Foo
object using a temporary Bar
. In other words, you want to create an object via Bar()
,
and pass that to the Foo
ctor to create a local Foo
object called x
:
void yourCode()
{
Foo x(Bar()); // You think this creates a Foo object called x...
x.blah(); // ...But it doesn't, so this line gives you a bizarre error message
// ...
}
It’s a long story, but one solution (hope you’re sitting down!) is to add an extra pair of ()
s around the Bar()
part:
void yourCode()
{
Foo x((Bar())); // The extra parens around Bar() save the day
x.blah(); // Ahhhh, this now works: no more error messages
// ...
}
Another solution is to use =
in your declaration (see the fine print below):
void yourCode()
{
Foo x = Foo(Bar()); // Yes, Virginia, that thar syntax works; see below for fine print
x.blah(); // Ahhhh, this now works: no more error messages
// ...
}
Note: Before C++17, the above solution requires yourCode()
to be able to access the Foo
copy constructor. In most situations that means the Foo
copy constructor needs to be public
, though it needn’t be public
in the less common case where yourCode()
is a friend of class Foo
. If you’re not sure what any of that means, try it: if your code compiles, you passed the test.
Here’s another solution (more fine print below):
void yourCode()
{
Foo x = Bar(); // Usually works; see below for fine print on "usually"
x.blah();
// ...
}
Note: The word “usually” in the above means this: the above fails only when Foo::Foo(const Bar&)
constructor is explicit
, or (before C++17) when Foo
’s copy constructor is inaccessible (typically when it is private
or protected
, and your code is not a friend
). If you’re not sure what any of that means, take 60 seconds and compile it. You are guaranteed to find out whether it works or fails at compile-time, so if it compiles cleanly, it will work
at runtime.
However, the best solution, the creation of which was at least partially motivated by the fact that this FAQ exists, is to use uniform initialization, which replaces the ()
around the Bar()
call with {}
instead.
void yourCode()
{
Foo x{Bar()};
x.blah(); // Ahhhh, this now works: no more error messages
// ...
}
That’s the end of the solutions; the rest of this is about why this is needed (this is optional; you can skip this section if you don’t care enough about your career to actually understand what’s going on; ha ha): When the compiler sees Foo x(Bar())
, it thinks that the Bar()
part is declaring a non-member function that returns a Bar
object, so it thinks you are declaring the existence of a function called x
that returns a Foo
and that takes a single parameter of type “non-member function that takes nothing and returns a Bar
.”
Now here’s the sad part. In fact it’s pathetic. Some mindless drone out there is going to skip that last paragraph, then they’re going to impose a bizarre, incorrect, irrelevant, and just plain stupid coding standard that says something like, “Never create temporaries using a default constructor” or “Always use =
in all initializations” or something else equally inane. If that’s you, please fire yourself before you do any more damage. Those who don’t understand the problem shouldn’t tell others how to solve it. Harumph.
(That was mostly tongue in cheek. But there’s a grain of truth in it. The real problem is that people tend to worship consistency, and they tend to extrapolate from the obscure to the common. That’s not wise.)
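If you would rather have the compiler confirm the explanation than take it on faith, the following sketch asks it directly what each declaration produced. It assumes a C++11 compiler; the function demoVexingParse() and the value returned by blah() are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

class Bar {
public:
    Bar() { }
};

class Foo {
public:
    Foo(const Bar&) { }
    int blah() { return 1; }    // hypothetical return value
};

int demoVexingParse()
{
    Foo x(Bar());       // declares a function named x; many compilers warn about this line
    static_assert(std::is_same<decltype(x), Foo(Bar (*)())>::value,
                  "x is a function that takes a pointer to function and returns a Foo");

    Foo y((Bar()));     // extra parentheses: an object
    Foo z{Bar()};       // braces: an object
    static_assert(std::is_same<decltype(y), Foo>::value, "y is a Foo object");
    static_assert(std::is_same<decltype(z), Foo>::value, "z is a Foo object");

    assert(y.blah() + z.blah() == 2);
    return y.blah() + z.blah();
}
```

The parameter written as Bar() is a function type, which the language adjusts to a pointer to function, hence the Bar (*)() in the first check.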
What is the purpose of the explicit
keyword?
The explicit
keyword is an optional decoration for constructors and conversion operators to tell the compiler that a certain constructor or conversion operator may not be used to implicitly cast an expression to its class type.
For example, without the explicit
keyword the following code is valid:
class Foo {
public:
Foo(int x);
operator int();
};
class Bar {
public:
Bar(double x);
operator double();
};
void yourCode()
{
Foo a = 42; // Okay: calls Foo::Foo(int) passing 42 as an argument
Foo b(42); // Okay: calls Foo::Foo(int) passing 42 as an argument
Foo c = Foo(42); // Okay: calls Foo::Foo(int) passing 42 as an argument
Foo d = (Foo)42; // Okay: calls Foo::Foo(int) passing 42 as an argument
int e = d; // Okay: calls Foo::operator int()
Bar x = 3.14; // Okay: calls Bar::Bar(double) passing 3.14 as an argument
Bar y(3.14); // Okay: calls Bar::Bar(double) passing 3.14 as an argument
Bar z = Bar(3.14); // Okay: calls Bar::Bar(double) passing 3.14 as an argument
Bar w = (Bar)3.14; // Okay: calls Bar::Bar(double) passing 3.14 as an argument
double v = w; // Okay: calls Bar::operator double()
}
But sometimes you want to prevent this sort of implicit promotion or implicit type conversion. For example, if Foo
is really an array-like container and 42 is the initial size, you might want to let your users say, Foo x(42);
or perhaps Foo x = Foo(42);
, but not just Foo x = 42;
. If that’s the case, you should use the explicit
keyword:
class Foo {
public:
explicit Foo(int x);
explicit operator int();
};
class Bar {
public:
explicit Bar(double x);
explicit operator double();
};
void yourCode()
{
Foo a = 42; // Compile-time error: can't convert 42 to an object of type Foo
Foo b(42); // Okay: calls Foo::Foo(int) passing 42 as an argument
Foo c = Foo(42); // Okay: calls Foo::Foo(int) passing 42 as an argument
Foo d = (Foo)42; // Okay: calls Foo::Foo(int) passing 42 as an argument
int e = d; // Compile-time error: can't convert d to an integer
int f = int(d); // Okay: calls Foo::operator int()
Bar x = 3.14; // Compile-time error: can't convert 3.14 to an object of type Bar
Bar y(3.14); // Okay: calls Bar::Bar(double) passing 3.14 as an argument
Bar z = Bar(3.14); // Okay: calls Bar::Bar(double) passing 3.14 as an argument
Bar w = (Bar)3.14; // Okay: calls Bar::Bar(double) passing 3.14 as an argument
double v = w; // Compile-time error: can't convert w to a double
double u = double(w); // Okay: calls Bar::operator double()
}
You can mix explicit
and non-explicit
constructors and conversion operators in the same class. For example, this class has an explicit
constructor taking a bool
but a non-explicit
constructor taking a double
, and can be implicitly converted to double, but only explicitly converted to bool:
#include <iostream>
class Foo {
public:
Foo(double x) { std::cout << "Foo(double)\n"; }
explicit Foo(bool x) { std::cout << "Foo(bool)\n"; }
operator double() { std::cout << "operator double()\n"; return 1.0; }
explicit operator bool() { std::cout << "operator bool()\n"; return true; }
};
void yourCode()
{
Foo a = true; // Okay: implicitly promotes true to (double)1.0, then calls Foo::Foo(double)
Foo b = Foo(true); // Okay: explicitly calls Foo::Foo(bool)
double c = b; // Okay: implicitly calls Foo::operator double()
bool d = b; // Okay: calls Foo::operator double() and implicitly converts to bool
if(b) {} // Okay, explicitly calls Foo::operator bool()
}
The above code will print the following:
Foo(double)
Foo(bool)
operator double()
operator double()
operator bool()
Variable a
is initialized using the Foo(double)
constructor because Foo(bool)
cannot be used in an implicit cast, but true
can be interpreted as a (double)true
, that is, as 1.0
, and implicitly cast to Foo
using Foo::Foo(double)
. This may or may not be what you intended, but this is what happens.
Why doesn’t my constructor work right?
This is a question that comes in many forms, such as:
- Why does the compiler copy my objects when I don’t want it to?
- How do I turn off copying?
- How do I stop implicit conversions?
- How did my int turn into a complex number?
By default a class is given a copy constructor and a copy assignment that copy all elements, and a move constructor and a move assignment that move all elements. For example:
struct Point {
int x,y;
Point(int xx = 0, int yy = 0) :x(xx), y(yy) { }
};
Point p1(1,2);
Point p2 = p1;
Here we get p2.x==p1.x
and p2.y==p1.y
. That’s often exactly what you want (and essential for C compatibility), but consider:
class Handle {
private:
string name;
X* p;
public:
Handle(string n)
:name(n), p(0) { /* acquire X called "name" and let p point to it */ }
~Handle() { delete p; /* release X called "name" */ }
// ...
};
void f(const string& hh)
{
Handle h1(hh);
Handle h2 = h1; // leads to disaster!
// ...
}
Here, the default copy gives us h2.name==h1.name
and h2.p==h1.p
. This leads to disaster: when we exit f()
the destructors for h1
and h2
are invoked and the object pointed to by h1.p
and h2.p
is deleted twice.
How do we avoid this? The simplest solution is to mark the operations that copy as deleted:
class Handle {
private:
string name;
X* p;
Handle(const Handle&) = delete; // prevent copying
Handle& operator=(const Handle&) = delete;
public:
Handle(string n)
:name(n), p(0) { /* acquire the X called "name" and let p point to it */ }
~Handle() { delete p; /* release X called "name" */ }
// ...
};
void f(const string& hh)
{
Handle h1(hh);
Handle h2 = h1; // error (reported by compiler)
// ...
}
If we need to copy or move, we can of course define the proper initializers and assignments to provide the desired semantics.
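One possible way to do that is sketched below, under the assumption that copying a Handle should give the copy its own X. The struct X, the get() accessor and demoHandleCopy() are hypothetical stand-ins, and the constructor here simply allocates an X rather than acquiring one by name.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct X { int value; };            // hypothetical stand-in for the resource type

class Handle {
private:
    std::string name;
    X* p;
public:
    explicit Handle(const std::string& n) : name(n), p(new X{0}) { }
    ~Handle() { delete p; }

    // Copy: give the new Handle its own X instead of sharing the pointer.
    Handle(const Handle& other)
        : name(other.name), p(other.p ? new X(*other.p) : nullptr) { }

    // Move: take over the pointer and leave the source empty.
    Handle(Handle&& other) noexcept
        : name(std::move(other.name)), p(other.p) { other.p = nullptr; }

    // Assignment by copy-and-swap; covers both copy and move assignment.
    Handle& operator=(Handle other) noexcept
    {
        std::swap(name, other.name);
        std::swap(p, other.p);
        return *this;
    }

    X* get() const { return p; }
};

void demoHandleCopy()
{
    Handle h1("resource");
    Handle h2 = h1;                     // deep copy: two distinct X objects
    h2.get()->value = 7;
    assert(h1.get() != h2.get());
    assert(h1.get()->value == 0);

    Handle h3 = std::move(h1);          // h3 takes over h1's X
    assert(h1.get() == nullptr);
    assert(h3.get()->value == 0);
}                                       // each X is deleted exactly once
```

Whether a deep copy is the right semantics depends on what X represents; for a resource that must not be duplicated, keeping the copy operations deleted and providing only the move operations is the more usual choice.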
Now return to Point
. For Point
the default copy semantics are fine; the problem is the constructor:
struct Point {
int x,y;
Point(int xx = 0, int yy = 0) :x(xx), y(yy) { }
};
void f(Point);
void g()
{
Point orig; // create orig with the default value (0,0)
Point p1(2); // create p1 with the default y-coordinate 0
f(2); // calls Point(2,0);
}
People provide default arguments to get the convenience used for orig
and p1
. Then, some are surprised by the conversion of 2
to Point(2,0)
in the call of f()
. This constructor defines a conversion. By default that’s an implicit conversion. To require such a conversion to be explicit, declare the constructor explicit
:
struct Point {
int x,y;
explicit Point(int xx = 0, int yy = 0) :x(xx), y(yy) { }
};
void f(Point);
void g()
{
Point orig; // create orig with the default value (0,0)
Point p1(2); // create p1 with the default y-coordinate 0
// that's an explicit call of the constructor
f(2); // error (attempted implicit conversion)
Point p2 = 2; // error (attempted implicit conversion)
Point p3 = Point(2); // ok (explicit conversion)
}